blasting Sigenae 8
Type of output
-------
Plan: take each fosmid…
assemble (Example Settings)
(Need to confirm all assemblies.)
Take consensus sequences with >20x coverage and >5000 bp to get a non-redundant dataset.
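A minimal sketch of the filtering step, assuming per-sequence coverage is available separately. The notes do not say how coverage was recorded, so the `coverage` dict and both function names are hypothetical. It keeps only consensus sequences above both cutoffs.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_consensus(records, coverage, min_len=5000, min_cov=20.0):
    """Keep records longer than min_len bp with coverage above min_cov.

    `coverage` is an assumed dict: sequence ID (first word of the
    header) -> mean coverage. Sequences missing from it are dropped.
    """
    kept = []
    for header, seq in records:
        words = header.split()
        seq_id = words[0] if words else ""
        if len(seq) > min_len and coverage.get(seq_id, 0.0) > min_cov:
            kept.append((header, seq))
    return kept
```

With real data, `read_fasta` would be fed an open file handle and the kept records written back out as a single FASTA file.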
Single FASTA file (~45k) generated.
Ran cd-hit-est locally.
Seemed to work.
Also running BLAST of the redundant file against itself.
Rerunning cd-hit-est with more stringency.
robertsmac:cd-hit-v4.5.4-2011-03-07 sr320$ ./cd-hit-est -i /Volumes/Bay4\ scratch/Combined_fasta_fosmids.fa -o /Volumes/Bay4\ scratch/Combined_fosmid_cd-est_V2 -c 0.95 -B 1
================================================================
Program: CD-HIT, V4.5.4, Feb 23 2012, 11:03:06
Command: ./cd-hit-est -i
/Volumes/Bay4 scratch/Combined_fasta_fosmids.fa -o
/Volumes/Bay4 scratch/Combined_fosmid_cd-est_V2 -c
0.95 -B 1
Started: Thu Feb 23 13:34:05 2012
================================================================
Output
----------------------------------------------------------------
total seq: 48409
longest and shortest : 45743 and 5001
Total letters: 369548689
Sequences have been sorted
Approximated minimal memory consumption:
Sequence : 5M
Buffer : 1 X 22M = 22M
Table : 1 X 17M = 17M
Miscellaneous : 4M
Total : 50M
Table limit with the given memory limit:
Max number of representatives: 4194304
Max number of word counting entries: 93673308
CHANGED NOTHING
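One way to check whether a rerun really changed nothing is to compare cluster counts from the .clstr file that cd-hit-est writes alongside the output FASTA. A sketch, assuming the usual .clstr layout (a `>Cluster N` line followed by one line per member, with a trailing `*` marking the representative). The function names are illustrative.

```python
def parse_clstr(lines):
    """Return a list of clusters; each cluster is a list of
    (sequence_name, is_representative) tuples."""
    clusters = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            clusters.append([])
        elif clusters:
            # Member line, e.g. "0\t5001nt, >name... *"
            _, _, rest = line.partition(">")
            name = rest.split("...", 1)[0]
            clusters[-1].append((name, line.endswith("*")))
    return clusters


def summarize(clusters):
    """Count clusters, member sequences, and single-member clusters."""
    sizes = [len(c) for c in clusters]
    return {
        "clusters": len(sizes),
        "sequences": sum(sizes),
        "singletons": sum(1 for s in sizes if s == 1),
    }
```

If two runs give the same summary and the same representatives, the stricter threshold did not change the clustering.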
again
-----
robertsmac:cd-hit-v4.5.4-2011-03-07 sr320$ ./cd-hit-est -i /Volumes/Bay4\ scratch/Combined_fasta_fosmids.fa -o /Volumes/Bay4\ scratch/Combined_fosmid_cd-est_V3 -c 0.80 -B 1
================================================================
Program: CD-HIT, V4.5.4, Feb 23 2012, 11:03:06
Command: ./cd-hit-est -i
/Volumes/Bay4 scratch/Combined_fasta_fosmids.fa -o
/Volumes/Bay4 scratch/Combined_fosmid_cd-est_V3 -c
0.80 -B 1
Started: Thu Feb 23 14:32:03 2012
================================================================
Output
----------------------------------------------------------------
total seq: 48409
longest and shortest : 45743 and 5001
Total letters: 369548689
Sequences have been sorted
Approximated minimal memory consumption:
Sequence : 5M
Buffer : 1 X 22M = 22M
Table : 1 X 17M = 17M
Miscellaneous : 4M
Total : 50M
Table limit with the given memory limit:
Max number of representatives: 4194304
Max number of word counting entries: 93673308
#fail
Checked the USER GUIDE and tried again:
robertsmac:cd-hit-v4.5.4-2011-03-07 sr320$ ./cd-hit-est -i /Volumes/Bay4\ scratch/Combined_fasta_fosmids.fa -o /Volumes/Bay4\ scratch/Combined_fosmid_cd-est_V5 -c 0.88 -n 7 -M 0
extremely slow
#fail
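The -n 7 in the -c 0.88 run matches the word-size guidance in the CD-HIT user guide for cd-hit-est, where lower identity thresholds call for shorter words. The earlier -c 0.80 run left -n at its default, which may be part of why it failed. The lookup below is a sketch of that guidance. The ranges are written from memory of the guide and should be checked against the guide for the installed version, and the function name is hypothetical.

```python
def suggested_word_sizes(identity):
    """Return the -n values suggested for a cd-hit-est -c threshold.

    Ranges are an assumption based on the CD-HIT user guide;
    verify them before relying on this.
    """
    if not 0.75 <= identity <= 1.0:
        raise ValueError("identity threshold outside the 0.75-1.0 range")
    if identity >= 0.90:
        return (8, 9, 10)
    if identity >= 0.88:
        return (7,)
    if identity >= 0.85:
        return (6,)
    if identity >= 0.80:
        return (5,)
    return (4,)
```

By this table the -c 0.80 run would have wanted -n 5, and the -c 0.95 run was fine with the default.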
Going to try the original (default) settings again:
./cd-hit-est -i /Volumes/Bay4\ scratch/Combined_fasta_fosmids.fa -o /Volumes/Bay4\ scratch/Combined_fosmid_cd-est_V6
Worked just fine.
Stayed with the original cd-hit-est run (default settings).
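The four runs differ only in output name and options, so they could be generated from one helper. This sketch builds the argument list for `subprocess`, which also avoids having to backslash-escape the space in `/Volumes/Bay4 scratch`. The helper names and the `RUNS` dict are illustrative; the flags and paths are the ones used in the commands above.

```python
import subprocess

CD_HIT_EST = "./cd-hit-est"
SCRATCH = "/Volumes/Bay4 scratch"
INPUT_FASTA = f"{SCRATCH}/Combined_fasta_fosmids.fa"

# Options per run, copied from the notebook commands.
RUNS = {
    "V2": {"-c": "0.95", "-B": "1"},
    "V3": {"-c": "0.80", "-B": "1"},
    "V5": {"-c": "0.88", "-n": "7", "-M": "0"},
    "V6": {},  # defaults
}


def build_command(output, options=None):
    """Return the cd-hit-est argument list for one run."""
    cmd = [CD_HIT_EST, "-i", INPUT_FASTA, "-o", output]
    for flag, value in (options or {}).items():
        cmd += [flag, value]
    return cmd


def run(version, dry_run=True):
    """Build the command for a run; execute it only if dry_run is False."""
    output = f"{SCRATCH}/Combined_fosmid_cd-est_{version}"
    cmd = build_command(output, RUNS[version])
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `run("V6")` returns the default command without executing it; `run("V6", dry_run=False)` would launch cd-hit-est from the current directory.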